Abstract

Background: Spectral counting methods provide an easy means of identifying proteins with differing abundances
between complex mixtures using shotgun proteomics data. The crux spectral-counts command,
part of the Crux software toolkit, implements four previously reported spectral counting methods:
the spectral index (SI_N), the exponentially modified protein abundance index (emPAI), the normalized spectral
abundance factor (NSAF), and the distributed normalized spectral abundance factor (dNSAF).

Results: We compared the reproducibility and the linearity relative to each protein's abundance of the four spectral 
counting metrics. Our analysis suggests that NSAF yields the most reproducible counts across technical and biological 
replicates, and both SI_N and NSAF achieve the best linearity.

Conclusions: With the crux spectral-counts command, Crux provides open-source modular methods to
analyze mass spectrometry data for identifying and now quantifying peptides and proteins. The C++ source code,
compiled binaries, spectra and sequence databases are available at
http://noble.gs.washington.edu/proj/crux-spectral-counts.



Background 

Existing methods for differential proteomics (reviewed by 
[1]) fall into two categories: spectral counting methods 
that rely on counting the number of spectra that map to 
a given protein across multiple experiments, and peptide 
chromatographic peak intensity methods that use the area 
under the peptide precursor ion peak as a measure of 
peptide abundance. In principle, methods based on mass
spectrometry peak areas are much more accurate,
but these methods require highly reproducible liquid
chromatography as well as accurate methods for chro- 
matographic alignment and identification of peaks within 
the profile spectra. In contrast, spectral counting meth- 
ods are straightforward to employ and have been shown to 
correctly detect known differences between samples [2], 
which contributes to their wide use. 

The command line tool crux spectral-counts
implements four popular spectral counting methods: the
spectral index (SI_N) [3], the exponentially modified protein
abundance index (emPAI) [4], the normalized spectral
abundance factor (NSAF) [5], and the distributed normalized
spectral abundance factor (dNSAF) [6]. The crux
spectral-counts command is integrated within the
Crux software toolkit, which provides actively main- 
tained open-source methods to identify and now quantify 
peptides and proteins from shotgun mass spectrometry 
datasets. Crux supports a variety of input spectra formats, 
and the tools can easily be incorporated into proteomic 
analysis pipelines, such as the Trans-Proteomic Pipeline 
(TPP) [7]. Finally, the modular design of Crux allows 
improvements to one part of the toolkit to be propagated 
through downstream analyses. 

Currently, several software packages offer spectral 
counting protein quantification methods [8]. ProteoIQ 
(http://www.bioinquire.com) and Scaffold [9] are com- 
mercial software products that post-process results from 
a variety of database search programs. Freely available 
tools such as APEX [10], emPAI calc [11], and PepC [12] 
each offer a single spectral counting method. Table 1 compares
the features of six spectral counting software tools.
Crux offers more spectral counting methods than other 



© 2012 McIlwain et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative
Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and
reproduction in any medium, provided the original work is properly cited.



McIlwain et al. BMC Bioinformatics 2012, 13:308
http://www.biomedcentral.com/1471-2105/13/308






Table 1 Spectral counting software 
This table summarizes the features of various spectral counting software
methods. 



tools and is the only one to provide peptide-level in
addition to protein-level counts.

Using crux spectral-counts, we compared and
contrasted the reproducibility and linearity of the four
spectral counting methods. Our experiments suggest that
the NSAF metric provides the most reproducible protein
quantification. In contrast, our linearity experiments show
that SI_N and NSAF provide the best performance, with
dNSAF providing intermediate performance and emPAI
yielding the worst linearity.

The contributions of this paper are thus twofold: we
describe a performance comparison of the reproducibility
and linearity of the SI_N, emPAI, NSAF, and dNSAF
protein quantification methods, and we provide to the
proteomics community a flexible, open-source spectral
counting software tool.

Implementation 

Software 

The crux spectral-counts command is implemented
as part of the Crux proteomics software toolkit
[13]. The toolkit is implemented in C++ as a single binary 



that supports commands for database searching and a 
variety of downstream analyses [14-18]. 

The crux spectral-counts command takes as
input a protein database in FASTA format and a collection
of peptide-spectrum matches (PSMs) produced by a
database search procedure. The PSMs may be in Crux's
tab-delimited text format, PeptideProphet's pepXML, or
mzIdentML [19]. To compute the SI_N score, a set of spectra
must also be provided as input in MS2, mzXML, or
MGF format. By default, crux spectral-counts
selects the PSMs in the input using a user-modifiable
threshold of q-value < 0.01.
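The selection step can be pictured with a minimal Python sketch. The record layout and the field names `protein` and `q_value` are hypothetical stand-ins chosen for illustration; they are not Crux's actual column names, and the sketch ignores peptides that map to several proteins.

```python
from collections import Counter

def count_spectra(psms, q_threshold=0.01):
    """Count PSMs per protein, keeping only PSMs whose q-value is below the threshold."""
    counts = Counter()
    for psm in psms:
        if psm["q_value"] < q_threshold:
            counts[psm["protein"]] += 1
    return dict(counts)

# Hypothetical PSM records, for illustration only.
psms = [
    {"protein": "P1", "q_value": 0.001},
    {"protein": "P1", "q_value": 0.02},   # rejected: q-value is not below 0.01
    {"protein": "P2", "q_value": 0.005},
]
print(count_spectra(psms))
```

Raising or lowering `q_threshold` corresponds to changing the user-modifiable threshold described above.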

For each protein with at least one spectral count, the
program then computes the NSAF, dNSAF, emPAI, or
SI_N score. The NSAF metric is defined as

\[ \mathrm{NSAF}_N = \frac{S_N / L_N}{\sum_{i=1}^{n} S_i / L_i} \]

where $N$ is the protein index, $S_N$ is the number of spectra
matched to protein $N$, $L_N$ is the length of protein $N$, and $n$
is the total number of proteins in the input database.
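As a rough illustration of the NSAF definition, the following Python sketch normalizes each protein's length-scaled spectral count by the sum over all proteins. The function name and the toy counts are assumptions made for this example, and the sketch assumes that at least one protein has a nonzero count.

```python
def nsaf(spectral_counts, lengths):
    """NSAF per protein: (count / length) divided by the sum of count / length over all proteins."""
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    return {p: value / total for p, value in saf.items()}

# Toy values, for illustration only.
counts = {"A": 10, "B": 10, "C": 0}
lengths = {"A": 100, "B": 200, "C": 300}
scores = nsaf(counts, lengths)
print(scores)
```

Proteins A and B have the same spectral count, but A is half as long, so it receives twice the NSAF; the values sum to one by construction.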
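The dNSAF metric is given by

\[ \mathrm{dNSAF}_N = \frac{\left( s^{u}_{N} + \sum_{j=1}^{k} d_{j,N}\, s^{s}_{j,N} \right) / L_N}{\sum_{i=1}^{n} \left( s^{u}_{i} + \sum_{j=1}^{k} d_{j,i}\, s^{s}_{j,i} \right) / L_i} \]

where $s^{u}_{N}$ is the spectral count for the peptides uniquely
mapping to protein $N$, $s^{s}_{j,N}$ is the spectral count of degenerate
peptide $j$ (out of the protein's $k$ degenerate peptides)
mapped to protein $N$, and $d_{j,N}$ is the distribution factor of
peptide $j$'s shared counts, defined by the equation

\[ d_{j,N} = \frac{s^{u}_{N}}{\sum_{i \in P_j} s^{u}_{i}} \]

where $P_j$ is the set of proteins to which peptide $j$ maps.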
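The following Python sketch illustrates the idea behind dNSAF. It assumes that the distribution factor is a protein's share of the unique spectral counts among the proteins sharing a peptide, following the description of the method in [6]. The function name, the input layout, and the handling of shared peptides with no unique evidence are assumptions of this sketch rather than a description of the Crux implementation.

```python
def dnsaf(unique_counts, shared_peptides, lengths):
    """Distributed NSAF.

    unique_counts:   protein -> spectral count of its uniquely mapping peptides
    shared_peptides: list of (spectral_count, proteins sharing the peptide)
    lengths:         protein -> sequence length
    """
    distributed = dict.fromkeys(lengths, 0.0)
    for protein, count in unique_counts.items():
        distributed[protein] += count
    for count, proteins in shared_peptides:
        denom = sum(unique_counts.get(p, 0) for p in proteins)
        if denom == 0:
            continue  # assumption: without unique evidence the shared counts are dropped
        for p in proteins:
            # Distribution factor: this protein's share of the unique counts.
            distributed[p] += count * unique_counts.get(p, 0) / denom
    saf = {p: distributed[p] / lengths[p] for p in lengths}
    total = sum(saf.values())
    return {p: value / total for p, value in saf.items()}

# Toy values: one shared peptide with 4 spectra, split 6:2 between A and B.
scores = dnsaf({"A": 6, "B": 2}, [(4, ["A", "B"])], {"A": 100, "B": 100})
print(scores)
```

In this toy case protein A receives three of the four shared spectra and protein B receives one, so the distributed counts are 9 and 3 before length normalization.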



The metric emPAI is defined as

\[ \mathrm{emPAI}_N = 10^{\,p^{\mathrm{observed}}_{N} / p^{\mathrm{observable}}_{N}} - 1 \]

where $p^{\mathrm{observable}}_{N}$ is the number of unique peptides observable
for protein $N$ and $p^{\mathrm{observed}}_{N}$ is the number of unique
peptides observed for protein $N$.
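A minimal Python sketch of emPAI, assuming the standard definition from [4], in which the ratio of observed to observable peptides is exponentiated with base 10 and one is subtracted. The function name and the example values are illustrative only.

```python
def empai(observed, observable):
    """emPAI = 10 ** (observed / observable) - 1 for a single protein."""
    if observable <= 0:
        raise ValueError("a protein needs at least one observable peptide")
    return 10 ** (observed / observable) - 1

# Illustrative values: half of the observable peptides were seen.
print(empai(5, 10))
print(empai(0, 10))
```

A protein with no observed peptides scores zero, and the score grows exponentially with the fraction of observable peptides that are detected.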
Finally, the SI_N score is calculated using

\[ \mathrm{SI}_N = \sum_{j=1}^{p_N} \left( \sum_{k=1}^{s_j} i_k \right) \]

where $p_N$ is the number of unique peptides in protein $N$, $s_j$
is the number of spectra assigned to peptide $j$, and $i_k$ is
the total fragment ion intensity of spectrum $k$. Analogous
scores can also be computed for each peptide, rather than
for each protein. A detailed description of the peptide-level
scoring metrics is available in the on-line documentation.
As output, crux spectral-counts produces
a tab-delimited file listing proteins and their corresponding
counts, in reverse sorted order.

The crux spectral-counts command also computes
a parsimonious set of proteins, using the greedy set
cover approach used by IDPicker [20]. Users thus have 
the option of considering spectral counts only for proteins 
within the parsimonious set. 
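The general greedy set cover idea can be sketched in Python as follows. This is the textbook greedy heuristic under simplifying assumptions, not a reproduction of IDPicker's or Crux's actual procedure, which may treat indistinguishable proteins and ties differently; the function name and the toy mapping are hypothetical.

```python
def greedy_parsimony(protein_to_peptides):
    """Greedy set cover: repeatedly pick the protein that explains the most uncovered peptides."""
    uncovered = set().union(*protein_to_peptides.values())
    chosen = []
    while uncovered:
        # Ties are broken alphabetically because max() keeps the first maximum.
        best = max(sorted(protein_to_peptides),
                   key=lambda p: len(protein_to_peptides[p] & uncovered))
        chosen.append(best)
        uncovered -= protein_to_peptides[best]
    return chosen

# Toy mapping from proteins to their identified peptides.
example = {
    "P1": {"a", "b", "c"},
    "P2": {"a", "b"},
    "P3": {"c", "d"},
}
print(greedy_parsimony(example))
```

Here P2 is left out of the parsimonious set because every peptide it contains is already explained by P1.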

Data Collection 

For the reproducibility experiments, proteins were 
extracted from the cochlear nucleus of the developing 
mouse brain at postnatal day 7 and postnatal day 21. Two 
biological replicates were generated for each age by dis- 
secting out the cochlear nuclei from two separate mice at 
each age. One of the 21-day mice was used to generate two
samples, thereby providing a technical replicate in addi- 
tion to a biological replicate. The samples prepared from 
the chicken brain were derived from nucleus laminaris, 
an auditory region in the brain stem. Samples were taken 
from the dorsal (D) and ventral (V) regions of this area. 
For each region, two biological replicates were generated, 
and one of those replicates was also subjected to techni- 
cal replication. Each sample was digested with trypsin and 
subjected to liquid chromatography followed by tandem 
mass spectrometry. 

For the linearity experiments, we used eight samples
that represent a dilution curve of 48 known proteins
synthesized by Sigma (UPS1, http://www.sigmaaldrich.com).
These data sets are mixtures (Std1-Std8) of the
C. elegans lysate at equal concentrations and the 48 proteins,
diluted two-fold in each successive standard.
Std8 has the lowest concentration of the known proteins
(6 fmol) and Std1 has the highest concentration
(870 fmol).

All three data sets are publicly available at
http://noble.gs.washington.edu/proj/crux-spectral-counts.

Data analysis 

The fragmentation spectra from the experiments were
searched against their respective mouse, chicken, or
C. elegans+UPS1 protein database using crux
sequest-search followed by crux q-ranker, with
the default parameters. The crux spectral-counts command
was then applied to the peptide-spectrum matches (PSMs) that
received q-values < 0.01. The resulting data sets for the
mouse and chicken replicates are summarized in Additional
file 1: Table S1, and the UPS1 dilution curve data
sets are summarized in Additional file 1: Table S2.



Results 

Testing reproducibility between replicates 

To investigate the reproducibility of the four spectral 
count methods, we analyzed mass spectrometry data 
from technical and biological replicates from chicken and 
mouse samples. We then produced a scatter plot for each 
pair of biological or technical replicates and computed the 
corresponding Spearman correlation. For these compar- 
isons, proteins identified in only one of the two datasets 
were ignored. Figure 1 shows sixteen such plots, corre- 
sponding to one biological and one technical replicate 
for chicken and mouse, respectively. The complete collec- 
tion of 76 plots is provided as Additional file 1: Figures 
S1-S2. From these analyses, as summarized in Table 2, 
we draw two primary conclusions. First, the spectral 
counts are generally reproducible: the mean correlation 
value across all 76 pairs is 0.867, and the minimum 
correlation is 0.719. Second, reassuringly, the technical 
replicates produce higher correlations than the biological 
replicates: the mean correlation among 24 pairs of tech- 
nical replicates is 0.885, whereas the corresponding value 
for the 52 pairs of biological replicates is 0.859 (two-tailed 
Wilcoxon rank-sum test p-value=0.026). 
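The per-pair comparison can be sketched in plain Python: restrict both replicates to the proteins they share, rank the scores, and correlate the ranks. The function names and the toy scores are assumptions made for illustration, and tied values receive average ranks.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman_shared(rep_a, rep_b):
    """Spearman correlation over proteins present in both replicates."""
    shared = sorted(set(rep_a) & set(rep_b))
    x = [rep_a[p] for p in shared]
    y = [rep_b[p] for p in shared]
    return pearson(average_ranks(x), average_ranks(y))

# Toy scores; P4 and P5 are ignored because each appears in only one replicate.
rep_a = {"P1": 0.10, "P2": 0.30, "P3": 0.20, "P4": 0.05}
rep_b = {"P1": 0.12, "P2": 0.25, "P3": 0.40, "P5": 0.01}
print(spearman_shared(rep_a, rep_b))
```

Because the correlation is computed on ranks, it is unaffected by the logarithmic scaling used for plotting.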

To test whether the observed differences in correla- 
tions among the four metrics are significant, we applied a 
Wilcoxon signed-rank test to paired sets of correlations. 
With four metrics, there are six possible paired comparisons.
Figure 2 shows the results of this analysis, where
one metric attaining a significant increase (using a
Bonferroni-corrected significance threshold of 0.05/6 ≈ 0.0083)
over another is indicated by a directed edge. From this graph
we conclude that, for the biological and technical replicates, NSAF
yields significantly more reproducible quantification values
than SI_N, dNSAF and emPAI. Our reproducibility
results agree with Colaert et al., who claim that NSAF
is more reproducible than SI_N and emPAI [21]. However,
in contrast to our results, Griffin et al. report better
reproducibility across replicates for SI_N compared to
NSAF [3].
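The multiple-testing arithmetic can be checked with a few lines of Python. The p-values below are made up purely to exercise the code and are not results from this study; the function names are likewise assumptions of this sketch.

```python
from itertools import combinations

METRICS = ["SI_N", "emPAI", "NSAF", "dNSAF"]

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests

def significant_pairs(p_values, alpha=0.05):
    """Return the metric pairs whose p-value falls below the corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return sorted(pair for pair, p in p_values.items() if p < threshold)

pairs = list(combinations(METRICS, 2))  # all unordered pairs of the four metrics

# Made-up p-values, for illustration only.
example_p = {pair: 0.5 for pair in pairs}
example_p[("SI_N", "NSAF")] = 0.001

print(len(pairs), bonferroni_threshold(0.05, len(pairs)))
print(significant_pairs(example_p))
```

Four metrics give six unordered pairs, so each individual test must reach a p-value below 0.05/6 to count as significant.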

Testing linear response for protein abundance across 
samples 

To determine the linear response of each of the spectral 
count metrics, we analyzed mass spectra from a dataset of 
samples that form a dilution curve of forty-eight proteins 
with known amounts spiked into a C. elegans lysate. We 
performed linear regression between each protein's spectral
count and the associated amounts across the dilution
curve samples. For this analysis, we only included proteins
that obtain a positive spectral count in three or more
of the data sets, which results in a comparison of forty-two
proteins across the four metrics. We then carried
out a Wilcoxon signed-rank test analysis separately on
the average correlation, R^2, and the mean percent error
Figure 1 Reproducibility of spectral counts across biological and technical replicate experiments. Each plot compares either the SI_N, emPAI,
NSAF or dNSAF measure for proteins that were reproducibly identified across two replicate experiments. For visualization purposes, the counts are
plotted on a logarithmic scale. The number in the lower right corner of each panel is the corresponding Spearman correlation and the number in
the upper left is the number of datapoints compared.



(MPE). The results of these tests (Figure 3) are fairly con- 
sistent with one another: NSAF significantly outperforms 
dNSAF, and dNSAF and SI_N significantly outperform
emPAI. 
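A per-protein linearity check of this kind can be sketched in Python with an ordinary least-squares fit. The function names and the toy values are assumptions of this sketch; the mean percent error is omitted because its exact definition is not given here.

```python
def eligible(counts, min_positive=3):
    """A protein is included only if it has a positive count in enough data sets."""
    return sum(1 for c in counts if c > 0) >= min_positive

def r_squared(amounts, counts):
    """R^2 of the least-squares line predicting spectral counts from known amounts."""
    n = len(amounts)
    mx = sum(amounts) / n
    my = sum(counts) / n
    sxx = sum((x - mx) ** 2 for x in amounts)
    sxy = sum((x - mx) * (y - my) for x, y in zip(amounts, counts))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(amounts, counts))
    ss_tot = sum((y - my) ** 2 for y in counts)
    return 1 - ss_res / ss_tot

# Toy two-fold dilution series with perfectly proportional counts.
amounts = [1, 2, 4, 8]
counts = [2, 4, 8, 16]
if eligible(counts):
    print(r_squared(amounts, counts))
```

Fitting one line per protein, as here, measures how well a metric tracks the relative change of a single protein across samples.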

Colaert et al. (2011) claim that SI_N is more accurate than
both NSAF and emPAI [21], but we find evidence only to
support the claim with respect to emPAI, even though our
experiments employ a wider dynamic range of protein abundance
(6-870 fmol versus 6.7-20 fmol) and more data sets (eight
versus two). Based on our experiments, we conclude that
NSAF and SI_N are the methods of choice for ensuring an
accurate linear response to a protein's change in
abundance across different samples.

It is worth noting that Griffin et al. (2010) observe a 
good linear fit between SI_N and protein quantification.



However, their evaluation methodology fits a single line to 
all of the SI_N values from many proteins, whereas we have
fit a separate line for each protein. This difference reflects
our belief that spectral counting methods are most useful
as measures of the relative abundance of a single protein
between two experiments. We did not test the claim
that SI_N provides an accurate absolute protein abundance
metric. 

Discussion 

Overall, our experiments suggest a relative ordering of 
spectral counting methods according to their repro- 
ducibility and the linearity of their response, but we can 
only speculate as to the reasons for the ranking that we 
observe. For example, we note that NSAF outperforms 
Table 2 Spectral-counting reproducibility performance on
mouse and chicken replicates

Metric         Technical    Biological    All Replicates
SI_N           0.885        0.848         0.859
emPAI          0.870        0.858         0.862
NSAF           0.899        0.876         0.884
dNSAF          0.886        0.852         0.863
All Metrics    0.885        0.859         0.867

This table summarizes the average correlation of the spectral-counting metrics
across the technical and biological replicates.



the emPAI metric in both of our experiments. The emPAI
measure takes into account the least information: not
only does it ignore fragment ion intensities, but it
also fails to account for the length of the protein. Apparently,
this relatively simple approach is insufficient to
accurately estimate protein abundance. The relative
performance of NSAF and SI_N, on the other hand, is less
clear: NSAF yields more reproducible results than SI_N, but
the two methods are statistically indistinguishable with
respect to linearity. The main difference between SI_N and
the other three metrics is that SI_N is the only metric
that takes into account the intensities of the fragment ion
peaks. In this sense, SI_N goes a bit beyond the strict
definition of "spectral counting." Our experiments do not
support the claim that such intensity information is valuable
for quantification. However, the conflicting results of
our study and Colaert et al., on the one hand, versus
Griffin et al., on the other hand, suggest that further
comparison of these methods is warranted.

An additional direction for future work involves quan- 
tifying the linearity and reproducibility of proteins in a 
segregated fashion according to protein abundance. For 
example, visual inspection of Figure 1 suggests that
perhaps the SI_N measure yields more reproducible counts for
high abundance proteins, with a corresponding decrease 
in reproducibility as the abundance drops. Arguably, in 




Figure 2 Comparison of spectral counts across replicates. This 
graph summarizes the statistical analysis of the reproducibility 
measurements. An edge leading out from node A to node B indicates 
a statistically significant improvement in reproducibility for method A 
relative to method B. 




Figure 3 Comparison of spectral counts across UPS1 dilution 
curve. This graph summarizes the statistical analysis of the linearity 
measurements. Two types of analysis were performed, using the 
linear regression correlation, R^2, and mean percent error (MPE) for the
C. elegans + UPS1 dilution curve dataset. An edge leading out from
node A to node B indicates a statistically significant improvement in 
linearity for method A relative to method B. 



many studies, such low-abundance proteins are of the
greatest interest; hence, it may be worthwhile to investi- 
gate in a systematic fashion the extent to which either the 
linearity or the reproducibility of a given spectral counting 
measure varies as a function of protein abundance. 

Conclusions 

Quantifying protein amounts in mass spectrometry by 
spectral counting is a simple and robust method for 
measuring the relative change of protein amounts across 
different samples; however, many different algorithms 
exist for assigning a score to each identified protein. 
Using crux spectral-counts, we compared and
contrasted four spectral counting methods with respect
to their reproducibility across replicates and their linear
response relative to protein abundance. Crux provides
a flexible, easy-to-use, open-source tool for performing
protein quantification using spectral counting.

Availability and requirements 

Project name: Crux tandem mass spectrometry analysis 
software 

Project home page: http://noble.gs.washington.edu/proj/crux

Operating systems: Linux, MacOS, Windows + Cygwin 
Programming language: C++ 

Other requirements: The binary version of Crux has no additional
requirements under Linux or MacOS. On Windows,
Crux requires Cygwin. Compiling Crux requires a C++
compiler, cmake, and Subversion.
License: Apache

Any restrictions to use by non-academics: None 
Additional file 



Additional file 1: Supplementary Information. Supplementary Tables 1
and 2 and Supplementary Figures 1 and 2 are provided as
quantify-supplement.pdf.